Skip to content

REF: use BlockManager.apply in csv code #36150

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Sep 12, 2020

Conversation

jbrockmendel
Copy link
Member

This doesn't get rid of internals usage entirely, but makes that access 1 layer less deep

@@ -588,7 +588,7 @@ def astype(self, dtype, copy: bool = False, errors: str = "raise"):

# use native type formatting for datetime/tz/timedelta
if self.is_datelike:
values = self.to_native_types()
values = self.to_native_types().values
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.to_numpy() ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, to_native_types returns a Block

@@ -2403,7 +2404,8 @@ def fillna(self, value, **kwargs):
def to_native_types(self, na_rep="NaT", **kwargs):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you now type all of these to return a Block?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually adding this makes mypy complain about signature mismatch, rather not futz with it ATM

@jreback jreback added Internals Related to non-user accessible pandas implementation IO CSV read_csv, to_csv labels Sep 6, 2020
Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! A small comment about moving the apply call into managers.py

Comment on lines 324 to 331
res = mgr.apply(
"to_native_types",
na_rep=self.na_rep,
float_format=self.float_format,
decimal=self.decimal,
date_format=self.date_format,
quoting=self.quoting,
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: add a BlockManager.to_native_types, and put the above lines in there. That makes it easier to override this (that's what I am doing in the ArrayManager POC, and was thinking to split that out)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Go for it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that fits perfectly in this PR

@jreback
Copy link
Contributor

jreback commented Sep 11, 2020

@jbrockmendel needs a rebase, but I agree @jorisvandenbossche suggsetion might be good in this PR

@jbrockmendel
Copy link
Member Author

just rebased. i encourage @jorisvandenbossche to push the edit he has in mind

@jreback jreback merged commit c104622 into pandas-dev:master Sep 12, 2020
@jreback
Copy link
Contributor

jreback commented Sep 12, 2020

thanks @jbrockmendel

yeah agree that followups to push down to_native_types makes sense

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Internals Related to non-user accessible pandas implementation IO CSV read_csv, to_csv
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants